ABSTRACT

This research examined the results from direct and
indirect writing assessments to determine the most effective method
of discrimination. The New Jersey State Department of Education
developed a test for ninth-grade students which was designed to
measure the ability to apply writing mechanics to written text and
communicate effectively in writing. The instrument combined direct
and indirect assessment in a 54-item multiple choice section and a
30-minute essay. This minimum competency test measured minimum
writing skills. Essays were holistically scored. Direct writing
assessment requires writing samples by examinees to be read and
scored by examiners. Indirect assessment requires examinees to
respond to items which measure correlates of writing. Both methods
are reliable assessments. In states which mandate that students pass
a writing test as part of the requirements for receiving a high
school diploma, the important criterion is which form of assessment
discriminates best between competent and incompetent writers. Results
of statistical analysis indicated the indirect assessment was a
better means of discrimination between competent and incompetent
writers. However, a combination of both methods, expressed as a
total test score, is considered the most appropriate method. (DWH)
Assessing Student Writing Skills:
A Comparison of Direct & Indirect Methods



Stephen L. Koffler
New Jersey State Department of Education



The assessment of students' writing skills has become a
focal point for statewide testing programs during the past few
years. As a result, administrators of large-scale testing programs
are grappling with a variety of new problems which must be
resolved and which were not encountered when reading and
mathematics programs were developed.

Of primary concern is "how do we measure students' writing
skills?" There are two distinct methods. Direct assessment
requires that writing samples be written by examinees and read and
scored by readers. Indirect assessment requires the examinee to
respond to items which measure the correlates of writing, usually
in a multiple choice format, rather than doing actual writing.
Certainly, there are advantages and disadvantages to each
method. Noyes, Sale and Stalnaker (1945) argued for indirect
assessment:

the good candidate who errs on a few of these [multiple
choice items] has plenty of opportunity to redeem himself; a
mistake on one item does not affect any other item. In writing a
theme, however, the candidate who makes a false start almost
inevitably involves his whole theme in difficulties even though
he may be, generally speaking, a good writer.



Breland (1977) espoused the case for direct assessment:

Clearly, writing involves much more than constructing
sentences. Writing requires a sense for organization of sentences
and paragraphs, the proper use of supporting detail and the
ability to distinguish fact from opinion, among other things.
Most multiple choice tests make no ostensible attempt to measure
these other important aspects of writing. Therefore, how can
multiple choice tests be of much value at all in assessing
[...] ability, it is believed that some persons simply do not respond
well to the multiple choice mode of testing.

Diederich (1974) argued that requiring students to produce
actual samples of writing, especially under test conditions, is
the most convincing test of their writing ability. According to
Spandel & Stiggins (1980), because indirect methods measure the
prerequisites of effective writing (understanding the basic
elements and conventions of standard English usage), they
represent necessary but not sufficient components of writing skills.

Spandel & Stiggins suggested that if resources and expertise
were available to specify the skills to be assessed, develop the
exercises, train the readers, and conduct at least two independent
readings of the exercises, then direct methods would provide
the most appropriate means for generating valid and reliable
information about writing skills. However, Quellmalz, Capell &
Chou (1982) showed that levels of performance varied on different
types of writing tasks. They inferred from this that writing for
different purposes and audiences draws on different skills and
that those skills must be measured separately. Also, numerous
studies have reported low reliabilities for direct assessment
methods (e.g. Akeju, 1972). However, Coffman (1966) and Godshalk,
Swineford and Coffman (1966) have shown that high reliability for
direct methods can be obtained by requiring multiple samples and
multiple readings of each sample.

Clearly, the two modes of assessing writing are satisfactory
and unsatisfactory from different points of view. Stiggins (1982)
noted that neither method is inherently superior. Rather, the
usefulness of each varies according to the context of the assessment
and the decisions to be made. Thus, one of the first tasks which
directors of large-scale testing programs must resolve is which
method(s) they should use to assess students' writing skills,
especially when passing the test is a requirement for high school
graduation. Because of cost and other factors, including the
amount of time available, it might be unlikely that more than one
essay can be written and scored for each student. The low
reliability of one essay might argue for indirect methods. Yet, the
face validity of indirect methods poses a problem: will the
public accept a test of writing skills which does not require
students to write?

For states which mandate that students must pass a writing
test as part of the requirements for being awarded a high school
diploma, the most important question is which form of writing
assessment is better able to discriminate between competent and
incompetent writers. Because the stakes are high (denying a
high school diploma), it is crucial that the assessment device
used be a reliable means of discriminating between those who will
be awarded a diploma and those who will not.

Others have examined whether direct and indirect methods
measure common or different skills. Yet, little research has
examined the effect of a single or combination approach on the
discriminant validity of writing assessment. Will a combination
approach be more effective than using solely direct or indirect
methods to discriminate between masters and nonmasters? And, if
so, in what combination? Because of the high school graduation
laws, such analyses become critical.

This research examined the results from direct and indirect
writing assessment methods to determine the best procedure for
effectively discriminating between competent and incompetent
writers. To do this, it was necessary to determine whether an
essay, by itself, or a multiple choice test, by itself, was an
effective means of discriminating, and whether in combination
they formed an even better means of discrimination.

Source 

In March 1983 a statewide writing test was administered for
the first time to approximately 94,000 ninth grade students in
New Jersey public schools. That test was designed to measure
students' ability to apply writing mechanics to written text and
to communicate effectively in writing. In order to assess both
aspects of the students' writing ability, the test consisted of a
54-item multiple choice section and a thirty minute essay.

The multiple choice section consisted of items measuring
skills in five clusters: sentence structure (11 items), usage (21
items), punctuation (9 items), capitalization (7 items), and
spelling (6 items). Since the test was a minimum competency test, the
skills measured were minimum skills identified by educators in
the state. All items were in a four-choice format. Students
received a score for each cluster and for the total multiple
choice section (the sum of the correctly answered items).

For the essay section, students were provided with a topic
which they had not previously seen and were given 30 minutes to
write no more than two pages of text. The time allotment included
a pre-writing period and a 5-minute editing period.

The topic was as follows:

Think of something you thought was unfair. It might be
something unfair that happened to you, to a friend, or
to someone you know. Write an essay that tells
what happened and how you felt about it.



The essays were scored during a four day period by N.J.
teachers using a holistic scoring process. Each essay was read
twice and scored on a 1-6 scale; thus, a student's essay could
receive a score from 2 to 12, the sum of the two readers' scores.
In cases where two readers awarded scores which differed by three
or more points, the essay was read by a third reader to resolve
the discrepancy. Only 1.5% of the essays needed a third reading.
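
The double-reading rule described above can be sketched in a few lines. This is a minimal illustration, not the scoring contractor's actual procedure: it assumes each reader scores on a 1-6 scale (so the sum runs from 2 to 12), and the names `combine_readings` and `max_gap` are hypothetical. The paper does not say how the third reading resolved a discrepancy, so the sketch only flags the essay for a third reader.

```python
def combine_readings(first: int, second: int, max_gap: int = 3):
    """Return (summed score, needs_third_reading) for two holistic readings.

    Assumes a 1-6 scale per reader; a gap of `max_gap` or more points
    between the two readers sends the essay to a third reader.
    """
    for score in (first, second):
        if not 1 <= score <= 6:
            raise ValueError("each reading must be on the 1-6 scale")
    total = first + second                      # ranges from 2 to 12
    needs_third = abs(first - second) >= max_gap
    return total, needs_third


# Two readers who roughly agree, then two who differ by three points.
print(combine_readings(4, 5))
print(combine_readings(2, 5))
```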

Students also received a score representing a combination of
the two sections, weighted 60% (essay) and 40% (multiple choice).
The total score was derived using a weighted sum of the z-scores
of the two sections. Further, that total score was reported on a
scale which ranged from 40-100 with a mean of 80 and a standard
deviation of 10.
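
A weighted z-score composite of this kind might be computed as in the sketch below. The 60/40 weights come from the text; everything else is an assumption made for illustration: the function names, the use of population standard deviations, the re-standardizing of the weighted sum before rescaling, the clipping to the reporting range, and the default scale constants (mean 80, SD 10, range 40-100), which reflect one reading of the reported scale.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values using the population standard deviation."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("scores have no variance")
    return [(v - m) / s for v in values]


def composite_scores(essay, mc, w_essay=0.6, w_mc=0.4,
                     scale_mean=80.0, scale_sd=10.0, lo=40.0, hi=100.0):
    """Weighted sum of section z-scores, rescaled to a reporting scale.

    Assumption: the weighted sum is re-standardized before rescaling and
    then clipped to [lo, hi]; the paper does not spell out these steps.
    """
    weighted = [w_essay * e + w_mc * m
                for e, m in zip(z_scores(essay), z_scores(mc))]
    return [min(hi, max(lo, scale_mean + scale_sd * z))
            for z in z_scores(weighted)]


# Four hypothetical students: essay sums (2-12) and multiple choice totals (0-54).
print(composite_scores([4, 6, 8, 10], [30, 40, 45, 50]))
```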

In addition to the students' test scores, teacher judgment
data were collected about each student. Following the
administration of the test, teachers were asked to provide their
independent judgment about their students' writing skill competence. The
following instructions were provided to the teachers (N.J. Dept.
of Educ., 1983):

To accomplish the task, you will enter two ratings for each
student on page 5 of his/her answer booklet in the area marked
"For Teachers Use Only." On the line labeled "S.S. 1," you will
enter a rating for the student relative to the skills assessed by
the multiple-choice section of the test. On the line labeled
"S.S. 2," you will enter a rating of the student's skill level in
written expression.

[...] describe the student in one of three categories
(Master, Borderline, Non-master) [...]. Mark "M" for "Master" if,
in your judgment, the student has mastered most of the skills
assessed. Mark "N" for "Non-master" if, in your judgment, the
student has not mastered most of the skills assessed. Mark "B"
for "Borderline" if the student has some of the skills but cannot
confidently be assigned to either the Master or Non-master group.

In performing this task, please remember the following:

(a) Judge each student relative to the kinds and levels of
skills assessed or expected by the test. For the
multiple-choice section (S.S. 1), you may wish to refresh
your memory as to the nature of the test questions before
providing your rating. For the essay rating (S.S. 2), rate
the student's mastery of written expression relative to what
you believe to be appropriate or adequate for the ninth
grade.

(b) Do not guess about a student's skill level. If you are
unfamiliar with the student, try to locate a staff member
who is sufficiently knowledgeable to provide a valid rating.
If you are unable to make a confident judgment of a
student's mastery and are unable to identify someone who can do
so, please leave the circles blank.

For purposes of this research, a spaced sample of the
statewide results was generated. Students who were classified in
a special education or limited English speaking program were
excluded from the sample. Further, the results for students who
became ill or were disruptive during a section(s) were excluded
from the section(s). Thus, the sample size for each of the test
sections was not identical. Based on a chi-square Goodness of Fit
test, it was determined that the sample was representative of the
state in terms of the distribution of students in different
socioeconomic types of school districts.

The sample consisted of information for 7326 students. For
each student, the following data were available: multiple choice
score, cluster scores, essay score, total score and the teachers'
mastery/borderline/nonmastery judgments. The data were analyzed
with the 1982 version of the Statistical Analysis System.

Table 1 presents summary statistics for each test section
for the sample as a whole and for the master, borderline and
nonmaster groups separately. Many more students were judged to be
masters than either borderline or nonmasters. Further, as
anticipated, there was a relationship between the mastery groupings
and the mean test scores. Students who were judged to have a mastery
of the tested skills scored higher, on the average, than students
considered to be borderline. Similarly, borderline students
scored higher than nonmasters. Finally, all sections of the
multiple choice test were relatively easy, especially for the
master group. Given the very high mean scores for all groups on
the capitalization cluster, very little information can be
discerned by examining that area.

Reliability data for the test sections are presented in
Table 2. For the multiple choice section, the reliability
coefficients are based on the Kuder-Richardson 20 formula. Since
reliability is a function of test length, the KR20 estimates for some
of the clusters are fairly low. The reliability estimate for the
essay score is based on the inter-rater reliability, after the
third reading for the essays which required such a reading. A
pooled within-cell correlation was computed as the estimate of
the inter-rater reliability. (The inter-rater reliability was
computed by National Evaluation Systems, Inc. as part of their
contract with the N.J. Department of Education.)
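
For readers who want to see how a Kuder-Richardson 20 estimate is obtained from item-level data, a minimal sketch follows. It assumes dichotomously scored items (1 = correct, 0 = incorrect) and uses the population variance of the total scores, which is one common convention; the function name `kr20` and the tiny data set are hypothetical and are not taken from the study.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson 20 for a list of per-student lists of 0/1 item scores."""
    n_students = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_variance = pvariance(totals)          # population variance (assumption)
    if total_variance == 0:
        raise ValueError("total scores have no variance")
    # Sum of p*q over items, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_students
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)


# Four hypothetical students answering three items.
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(data))
```

Because the item variances sum over fewer terms in a short cluster, the same formula tends to yield lower estimates for the smaller clusters, which is the test-length effect noted above.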




Table 1

Summary Statistics For The Entire Sample
And For The Master/Borderline/Nonmaster Groups

Test              Group          N      Mean    Median   S.D.

Essay             Master         --      8.0     --      --
(1 item)          Borderline     --      6.6      7      1.9
                  Nonmaster      --      5.2     --      --
                  Total         7162     --       7      2.1

Multiple          Master        3381    47.9     49      5.3
Choice            Borderline     --     42.9     --      5.7
(54 items)        Nonmaster      --     35.6     37      --
                  Total         7138     --      46      7.7

Sentence          Master        3381     9.5     --      1.7
Structure         Borderline     --      8.1     --      2.3
(11 items)        Nonmaster     1275     6.4      6      2.5
                  Total         7176     --       9      --

Usage             Master         --      --      --      2.7
(21 items)        Borderline     --     15.9     --      3.3
                  Nonmaster     1287    13.2     --      4.1
                  Total         7193    15.3     17      3.9

Punctuation       Master        3382     8.3      9      --
(9 items)         Borderline    2523     7.7      8      --
                  Nonmaster     1286     6.8      7      1.3
                  Total         7191     7.7      8      1.7

Capitalization    Master        3382     --       7      --
(7 items)         Borderline    2522     6.5      7      --
                  Nonmaster     1286     5.1      5      --
                  Total         7190     6.5      7      --

Spelling          Master        3378     5.2     --      --
(6 items)         Borderline    2519     --      --      --
                  Nonmaster     1280     4.2      4      1.3
                  Total         7177     4.8      5      1.2


Table 2

Reliability Coefficients for
The Direct and Indirect Methods

               Essay   M.C.   S.S.   Us.    Pu.    Ca.    Sp.

Reliability    0.69    0.90   0.76   0.81   0.57   0.57   0.43

Table 3 provides the uncorrected and corrected correlation
coefficients for the different test sections. Corrected
coefficients are provided to correct for attenuation due to
unreliability. They were calculated by dividing the uncorrected
correlation coefficient by the product of the square roots of the
two relevant reliability coefficients.
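
The correction for attenuation described above is a one-line computation: the observed correlation is divided by the product of the square roots of the two reliabilities (equivalently, by the square root of their product). The sketch below is illustrative only; the function name `disattenuate` is hypothetical, and the input values are round numbers chosen for the example rather than figures from the tables.

```python
from math import sqrt


def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Correct an observed correlation for unreliability in both measures."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / (sqrt(rel_x) * sqrt(rel_y))


# Illustrative inputs: observed r = 0.50, reliabilities 0.64 and 1.00.
print(disattenuate(0.50, 0.64, 1.00))
```

Note that the corrected value grows quickly as either reliability falls, so corrected coefficients involving short, low-reliability clusters should be read with caution.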








Table 3

Uncorrected & Corrected Correlation Coefficients
Between the Direct & Indirect Methods(a)

          Essay   M.C.   S.S.   Us.    Pu.    Ca.    Sp.

Essay      1.0    .53    --     --     --     --     --
M.C.       --     1.0
S.S.       --            1.0    .71    .55    --     --
Us.        --            .90    1.0    .57    .49    .45
Pu.        .73           .84    .84    1.0    --     --
Ca.        --            --     .72    --     1.0    .35
Sp.        .71           .53    .72    .73    --     1.0

(a) Uncorrected correlations appear above the diagonal; corrected
correlations below the diagonal. Coefficients are not
included for the relationships between the total multiple choice
test and the five clusters because of the dependence between the
clusters and the total multiple choice scores.

The correlations, especially those for the relationship
between the essay and the total multiple choice section and
some of the clusters, are strong and not inconsistent with the
findings from other studies (e.g. Breland & Gaynor, 1979; Hogan &
Mishler, 1980; Moss, Cole & Khampalikit (grade 10 results), 1982).

Biserial correlations were computed to compare the students'
mastery designation (master v. nonmaster) to their test scores
(see Table 4). Biserial correlations were used instead of the
more conventional point biserial correlations because it was
assumed that the underlying distribution of the mastery judgments
was normally distributed (Glass & Stanley, 1970).
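
The biserial coefficient can be sketched as below. This is a textbook form of the statistic, not necessarily the exact routine used in the study: it assumes the dichotomy (master vs. nonmaster) cuts an underlying normal variable, uses the population standard deviation of the test scores, and relies on Python's standard-library `NormalDist` for the normal ordinate. The function name `biserial` and the sample data are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def biserial(scores, is_master):
    """Biserial correlation between a test score and a master/nonmaster split."""
    masters = [s for s, m in zip(scores, is_master) if m]
    nonmasters = [s for s, m in zip(scores, is_master) if not m]
    if not masters or not nonmasters:
        raise ValueError("both groups must be non-empty")
    p = len(masters) / len(scores)              # proportion judged masters
    q = 1 - p
    std_normal = NormalDist()
    # Height of the standard normal curve at the point dividing p from q.
    ordinate = std_normal.pdf(std_normal.inv_cdf(p))
    return (mean(masters) - mean(nonmasters)) / pstdev(scores) * (p * q / ordinate)


# Eight hypothetical students with overlapping score distributions.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
judged_master = [False, False, True, False, True, False, True, True]
print(biserial(scores, judged_master))
```

For the same data the biserial value is always larger in magnitude than the point biserial, since it rescales by the normal ordinate rather than by the square root of pq.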

From Table 4, it is evident that the total multiple choice
section (as well as the sentence structure cluster and the usage
cluster) was a better discriminator than the essay. This result
might argue for indirect methods instead of direct methods for
tests whose primary purpose is for mastery decisions.

Table 4

Biserial Correlations Examining the Relationship
Between Mastery/Nonmastery and Test Scores

               Essay   M.C.   S.S.   Us.    Pu.    Ca.    Sp.

Correlation    0.74    0.83   0.78   --     --     0.52   --



To better examine the ability of the essay and multiple
choice tests to discriminate between the competent and
incompetent writers, passing scores were determined and the relationship
of the pass/fail rate to the mastery decisions was examined.
There exist many methods for setting passing scores, some based
on judgments about items and some based on judgments about
examinees. Livingston & Zieky (1982) provide an excellent description
of the major procedures used to set passing scores; therefore, a
discussion of the various procedures is excluded here.

Passing scores were set based on a Contrasting Groups
procedure. The passing score set by that method is the score
which best separates the distribution of students judged to be
masters from the distribution of those judged to be nonmasters.
To determine the passing scores for each test section, the data
were analyzed using a discriminant analysis (Koffler, 1980). Only
students considered by their teachers to be masters or nonmasters
were included in the analysis. By including only these two groups,
i.e. those for whom their teachers were certain of their mastery/
nonmastery status, and by excluding the borderlines, the
separation between the clearly competent and clearly incompetent
was more discernible.

Table 5 presents the results of the Contrasting Groups
analysis. That table also includes the percent of students judged
by their teachers to be nonmasters who would have been classified
as masters based on their test score (false masters) and the
percent of students judged to be masters who would have been
classified as nonmasters (false nonmasters). It also includes the total
percent of students misclassified.

It should be clear that a goal in setting a passing score is
to minimize the percent of students misclassified. Depending upon
the costs associated with each misclassification error, one may
be more concerned with minimizing either the proportion of false
masters or the proportion of false nonmasters. However, for this
study, and for most, because the costs were not discernible, it
was assumed that the respective costs of misclassification were
equal; thus, the major concern was the total proportion of
misclassified students.
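
The equal-cost criterion can be illustrated with a simple search over candidate cut scores. Note that this is a swapped-in simplification: the study set its passing scores with a discriminant analysis, whereas the sketch below just tries every observed score as a cut and keeps the one with the lowest total misclassification. The function names and the data are hypothetical.

```python
def misclassification(scores, is_master, cut):
    """Percent false masters, false nonmasters and total misclassified at a cut."""
    masters = [s for s, m in zip(scores, is_master) if m]
    nonmasters = [s for s, m in zip(scores, is_master) if not m]
    if not masters or not nonmasters:
        raise ValueError("both groups must be non-empty")
    false_masters = sum(s >= cut for s in nonmasters)    # nonmasters who pass
    false_nonmasters = sum(s < cut for s in masters)     # masters who fail
    return {
        "false_master_pct": 100 * false_masters / len(nonmasters),
        "false_nonmaster_pct": 100 * false_nonmasters / len(masters),
        "total_pct": 100 * (false_masters + false_nonmasters) / len(scores),
    }


def best_cut(scores, is_master):
    """Lowest observed score that minimizes total misclassification (equal costs)."""
    candidates = sorted(set(scores))
    return min(candidates,
               key=lambda c: misclassification(scores, is_master, c)["total_pct"])


scores = [3, 4, 5, 6, 7, 8, 9, 10]
judged_master = [False, False, False, True, False, True, True, True]
cut = best_cut(scores, judged_master)
print(cut, misclassification(scores, judged_master, cut))
```

If the two kinds of error carried different costs, the `key` function could weight false masters and false nonmasters unequally instead of using the total.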



Table 5

Passing Scores For the Direct & Indirect Methods

          No. of              % False    % False       Total Percent
Test      Items     Cut-off   Masters    Nonmasters    Misclassified

Essay       1          6       44.7        --             --
M.C.       54         43        --         --             --
S.S.       11         --        --         --             --

To obtain an unbiased estimate of the percent of
misclassified students, the discriminant analysis was conducted on a
randomly drawn subsample of 30% of the original sample. Once the
passing scores were determined, the remaining 70% of the sample
was used to estimate the percent of false masters and false
nonmasters in the population.
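
The split described here is a standard holdout design: the cut score is set on one random portion of the sample and the error rates are estimated on the rest, so the estimate is not flattered by the data that produced the cut. A minimal sketch, assuming a 30%/70% split; the function name `split_sample`, the fixed seed, and the integer records are hypothetical.

```python
import random


def split_sample(records, fraction=0.30, seed=0):
    """Randomly split records into a standard-setting part and an estimation part."""
    rng = random.Random(seed)        # fixed seed so the split is reproducible
    shuffled = list(records)
    rng.shuffle(shuffled)
    k = round(len(shuffled) * fraction)
    return shuffled[:k], shuffled[k:]


# 100 hypothetical student records identified by index.
setting_part, estimation_part = split_sample(range(100))
print(len(setting_part), len(estimation_part))
```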

The data presented in Table 5 indicate that students were
misclassified least often on the basis of the total multiple
choice test. This result is in agreement with the biserial
correlation coefficient reported in Table 4. Thus, the multiple choice
test by itself is a better means of discriminating between the
competent and incompetent writers than the essay is by itself.

In all cases, a far greater percent of nonmasters were
misclassified as masters based on their test scores than masters
so misclassified. This may suggest that teachers are more certain
of their judgments about students who are competent than about
those who are incompetent. However, for the indirect methods, it
may also reflect the relative easiness of the multiple choice
test, which resulted in very high passing scores (especially for
some of the smaller clusters).

Considered separately, the indirect assessment method is a
slightly better procedure than the direct method for
discriminating between masters and nonmasters. However, an important
question to address is whether the multiple choice and essay scores
can be used in combination to make the pass/fail decision more
accurate (i.e. decrease the percent of false masters & false
nonmasters). A set of analyses was conducted to examine that issue.

The first set of analyses examined the total test score and
were similar to those conducted for the separate test sections. A
combination approach was used to determine the a priori
classification of a student as a master or a nonmaster because both the
essay and multiple choice scores were being used. A student was
considered to be a master if he/she were judged to be a master
for both the essay and multiple choice sections. Likewise, a
student was considered to be a nonmaster if he/she were judged to
be a nonmaster on both sections. Students for whom the judgment
was not consistent were not included in this analysis. Table 6
provides information indicating the level of agreement between
the mastery decisions on both sections. In total the judgments
were in agreement for 6158 of the students (over 85%).
The top figure in parenthesis represents the column percent;
the lower figure represents the row percent.



The biserial correlation between the master/nonmaster groups
and the total test score was 0.81. This value was greater than
the biserial coefficient for the essay but slightly less than
that for the multiple choice test. Based on the Contrasting
Groups procedure, a passing score of 76.9 was set. Using that
passing score, 24% of the students in the nonmaster group were
misclassified and 12.2% of the masters were misclassified. In
total, 15.7% of the students were misclassified. Thus, the
classification of students based on a weighted combination of the
tests was more accurate than that based on either procedure alone.

The next set of analyses considered the essay and multiple
choice test together, but as separate entities. Analyses were
conducted based on multiple decision rules. First, using the
essay and multiple choice cut scores (see Table 5), in order for
a student to 'pass', he/she had to have an essay score at least
equal to 6 and a multiple choice score at least equal to 43. All
other students were considered to 'fail'. Second, using the essay
and the scores for each cluster, a student was considered to
'pass' if his/her score on each section was at least equal to the
passing score.
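
A multiple decision rule of this kind is conjunctive: a student passes only if every section score meets its own cut. The sketch below uses the essay and multiple choice cut scores quoted above (6 and 43); the function name `passes_all` and the dictionary keys are hypothetical, and the same function would cover the essay-plus-clusters rule if given one cut per cluster.

```python
def passes_all(scores: dict, cuts: dict) -> bool:
    """True only if the student meets or exceeds the cut score on every section."""
    return all(scores[section] >= cut for section, cut in cuts.items())


# Cut scores for the first decision rule described in the text.
CUTS = {"essay": 6, "multiple_choice": 43}

print(passes_all({"essay": 6, "multiple_choice": 43}, CUTS))
print(passes_all({"essay": 10, "multiple_choice": 42}, CUTS))
```

Unlike the weighted total score, this rule is non-compensatory: a strong essay cannot offset a multiple choice score below the cut, which is why it tends to reduce false masters at the price of more false nonmasters.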

A final analysis was conducted using only the Sentence
Structure and Usage clusters. These two clusters were used
because, based on the results of a stepwise discriminant analysis,
they were the only two which significantly added to the
discriminant ability of the essay. That is, when using the essay, the
sentence structure and the usage clusters, the ability to
discriminate between the masters and nonmasters was better than the
discrimination based solely on the essay. However, the addition
of the other clusters did not significantly increase the accuracy
of the classification.

Table 7 provides information about these analyses. That
table shows that there is an increase in the accuracy of the
categorization based on the multiple stage procedure, especially for
the combination of essay and cluster scores. This is probably due
to the greater difficulty in 'passing' (students must score at
least equal to the cut score on all sections). Thus, there is a
decrease in the percent of false masters. While these results
also provide for a more accurate classification of students
compared to using either the essay or multiple choice test alone,
these results do not provide for as accurate a classification as
when the weighted combination of the two measures is considered.

When a key purpose for developing a competency test is to
make a decision about whether students should be awarded a high
school diploma, it is critical that the test be as sensitive as
possible to correctly classifying students as either masters or
nonmasters. The errors made by awarding or denying diplomas
erroneously cannot be quantified, but clearly can be large.

This study has shown that if a choice has to be made between
using an indirect or a direct method, the indirect method
provides for a better means of discriminating between the competent
and incompetent writer. However, the use of either method alone
raises the question as to whether either can sufficiently measure
the area of writing. A better approach, indeed the most
appropriate manner in which to determine whether a student is a master
or a nonmaster, is by using a combination of both direct and
indirect methods, creating a weighted total test score and basing
the pass/fail decision on that total test score.


Akeju, S. A. The reliability of General Certificate of Education
Examination English Composition Papers in West Africa. Journal
of Educational Measurement, 1972, 9, 175-180.

Breland, H. M. Can multiple-choice tests measure writing skills?
The College Board Review, 1977, 103.

Breland, H. M. & Gaynor, J. L. A comparison of direct and indirect
assessments of writing skills. Journal of Educational
Measurement, 1979, 16, 119-128.

Coffman, W. E. On the validity of essay tests of achievement.
Journal of Educational Measurement, 1966, 3, 151-156.

Diederich, P. B. Measuring Growth in English. Urbana, Ill.:
National Council of Teachers of English, 1974.

Glass, G. V. & Stanley, J. C. Statistical Methods in Education and
Psychology. Englewood Cliffs, N.J.: Prentice-Hall Inc., 1970.

Godshalk, F. I., Swineford, F. E. & Coffman, W. E. The Measurement
of Writing Ability. Princeton: Educational Testing Service, 1966.

Hogan, T. P. & Mishler, C. Relationships between essay tests and
objective tests of language skills for elementary school
students. Journal of Educational Measurement, 1980, 17, 219-227.

Koffler, S. L. A comparison of approaches for setting proficiency
standards. Journal of Educational Measurement, 1980, 17, 167-178.

Livingston, S. A. & Zieky, M. J. Passing Scores. Princeton, N.J.:
Educational Testing Service, 1982.

Moss, P. A., Cole, N. S. & Khampalikit, C. A comparison of
procedures to assess written language skills at grades 4, 7,
and 10. Journal of Educational Measurement, 1982, 19, 37-47.

New Jersey State Department of Education. Minimum Basic Skills
Testing Program: District Test Coordinator's Manual, 1983.

Noyes, E. S., Sale, W. M. & Stalnaker, J. M. Report on the First Six
Tests in English Composition. New York: The College Board, 1945.

Quellmalz, E. S., Capell, F. J. & Chou, Chih-Ping. Effects of
discourse and response mode on the measurement of writing
competence. Journal of Educational Measurement, 1982, 19, 241-258.

Spandel, V. & Stiggins, R. J. Direct Measures of Writing Skill:
Issues and Applications. Portland, Or: Northwest Regional
Educational Laboratory, 1980.

Stiggins, R. J. A comparison of direct and indirect writing
assessment methods. Research in the Teaching of English, 1982,
16, 101-114.



